{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": "tensor([-0.1525, -1.2754]) tensor(8.2324)\n"
    }
   ],
   "source": [
    "import torch\n",
    "from matplotlib import pyplot as plt \n",
    "import numpy as np\n",
    "import random\n",
    "\n",
    "num_inputs = 2\n",
    "num_features = 1000\n",
    "true_w = [2,-3.4]\n",
    "true_b = 4.2\n",
    "features = torch.tensor(np.random.normal(0,1,(num_features,num_inputs)),dtype=torch.float32)\n",
    "labels = true_w[0] * features[:,0] + true_w[1] * features[:,1] + true_b\n",
    "labels += torch.tensor(np.random.normal(0,0.01,size=labels.size()),dtype=torch.float32)\n",
    "\n",
    "print(features[0],labels[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 使用Dataloader读取数据\n",
    "在前面中，我们自定义了`data_iter`来遍历训练样本，在Pytorch中封装了`torch.utils.data`包进行数据的读取与处理。对于简单的数据集来说，可以直接使用`TensorDataset`包装，然后使用`Dataloader`进行读取。 对于复杂或者无法一次加载到内存中的数据计，可以自定义`Dataset`。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": "tensor([[ 0.2746,  0.2303],\n        [ 0.7015,  0.6191],\n        [-0.9822,  1.3436],\n        [ 1.1595, -0.0456],\n        [-0.0541,  1.0804],\n        [-0.3530, -0.2207],\n        [-0.5002,  0.2155],\n        [-1.1924,  0.6273],\n        [-0.4808,  1.0043],\n        [-0.5958,  0.3573]]) tensor([ 3.9597,  3.5205, -2.3312,  6.6775,  0.4075,  4.2445,  2.4669, -0.3224,\n        -0.1832,  1.7917])\n"
    }
   ],
   "source": [
    "import torch.utils.data as Data \n",
    "\n",
    "batch_size = 10\n",
    "dataset = Data.TensorDataset(features,labels)\n",
    "data_iter = Data.DataLoader(dataset,batch_size=batch_size,shuffle=True)\n",
    "\n",
    "for X,y in data_iter:\n",
    "    print(X,y)\n",
    "    break"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 定义网络模型\n",
    "\n",
    "这里自定义一个网络层来实现线性回归。 首先，导入`torch.nn`模块，nn表示**neural networks**,该模块定义了大量神经网络的层。`nn`的核心数据结构为`Module`，是一个抽象的概念，**即可以表示神经网络中的某个层（layer），也可以表示一个包含很多层的的网络。\n",
    "\n",
    "在实际使用中，通常可以继承`nn.Module`，实现自定义的网络/层。一个`nn.Module`实例应该包含一些层以及返回输出结果的前向传播(forward)方法。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": "LinearNet(\n  (linear): Linear(in_features=2, out_features=1, bias=True)\n)\n"
    }
   ],
   "source": [
    "\n",
    "import torch.nn as nn\n",
    "class LinearNet(nn.Module):\n",
    "    def __init__(self,n_feature):\n",
    "        super(LinearNet,self).__init__()\n",
    "        self.linear = nn.Linear(n_feature,1)\n",
    "\n",
    "    def forward(self,x):\n",
    "        y = self.linear(x)\n",
    "        return y\n",
    "\n",
    "net = LinearNet(num_inputs)\n",
    "print(net)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以使用PyTorch提供的容器`Sequential`来方便快捷的搭建网络。`Sequential`是一个有序容器，网络层将按照加入到`Sequential`的顺序被依次添加到计算图中。\n",
    "```\n",
    "# 写法一\n",
    "net = nn.Sequential(\n",
    "    nn.Linear(num_inputs,1)\n",
    ")\n",
    "# 写法二\n",
    "net = nn.Sequential()\n",
    "net.add_module('linear',nn.Linear(num_inputs,1))\n",
    "# net.add_module ...\n",
    "\n",
    "# 写法三\n",
    "from collection import OrdereDict\n",
    "net = nn.Sequential(OrdereDict([\n",
    "    ('linear',nn.Linear(num_inputs,1))\n",
    "    # ()...\n",
    "    ]))\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 初始化模型参数\n",
    "\n",
    "### 参数的访问"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": "Parameter containing:\ntensor([[ 0.1447, -0.3669]], requires_grad=True)\ntensor([[ 0.1447, -0.3669]])\nParameter containing:\ntensor([-0.5967], requires_grad=True)\ntensor([-0.5967])\n----- named parameters -----\nlinear.weight\nlinear.bias\n"
    }
   ],
   "source": [
    "# 可以调用`parameters()`或者`named_parameters()`来访问网络各个层的参数\n",
    "\n",
    "for param in net.parameters():\n",
    "    print(param)\n",
    "    print(param.data)\n",
    "\n",
    "print(\"----- named parameters -----\")\n",
    "for name,param in net.named_parameters():\n",
    "    print(name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`parameters()`返回包含模型所有参数的迭代器。每个模型参数包括两部分：`data`参数的值和`requires_grad`指示是否自动求导。在上一个实现中，自定义参数的时候，需要手动的制定其`requreid_grad`属性.\n",
    "```\n",
    "w = torch.tensor(np.random.normal(0,0.01,(num_inputs,1)),dtype=torch.float32)\n",
    "b = torch.zeros(1,dtype=torch.float32)\n",
    "\n",
    "w.requires_grad_(requires_grad=True)\n",
    "b.requires_grad_(requires_grad=True)\n",
    "```\n",
    "`parameters()`通常作为参数传递给`optimizer`使用。_\n",
    "\n",
    "`name_parameters()`不仅仅返回模型参数的，还返回该参数的名称，例如：`linear.weight`。 这样在微调中有选择的加载预训练模型就比较方便，可以根据其名称有选择的加载或者不记载。\n",
    "\n",
    "\n",
    "### 参数初始化\n",
    "模型训练前要对其参数进行初始化，对于bias参数可以将其设置为0。对于权重则不能这么简单的处理，PyTorch的`init`模块中提供了多种参数初始化方法。下面使用`init.normal_`将权重初始化为均值为0，标准差为0.01的正态分布"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": "Parameter containing:\ntensor([[ 0.0013, -0.0009]], requires_grad=True)\nParameter containing:\ntensor([0.], requires_grad=True)\n"
    }
   ],
   "source": [
    "from torch.nn import init\n",
    "\n",
    "for name,param in net.named_parameters():\n",
    "    if \"weight\" in name:\n",
    "        init.normal_(param.data,mean=0,std=0.01)\n",
    "    elif \"bias\" in name:\n",
    "        init.constant_(param.data,val=0)\n",
    "\n",
    "for param in net.parameters():\n",
    "    print(param)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 损失函数\n",
    "\n",
    "`nn`模块中提供了各种损失函数，损失函数可以作为神经网络中的一个**特殊的层**,这些损失函数也是`nn.Module`的子类"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "loss = nn.MSELoss()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 优化方法\n",
    "`torch.optim`提供了很多常用的优化算法比如SGD,Adam以及RMSPro等。 下面使用SGD并将其学习率指定为0.03."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": "SGD (\nParameter Group 0\n    dampening: 0\n    lr: 0.03\n    momentum: 0\n    nesterov: False\n    weight_decay: 0\n)\n"
    }
   ],
   "source": [
    "import torch.optim as optim\n",
    "\n",
    "optimizer = optim.SGD(net.parameters(),lr=0.03)\n",
    "print(optimizer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "还可以为不同的网络层设置不同的学习率，这在finetune的时候是非常有用的。\n",
    "```\n",
    "optimizer = optim.SGD([\n",
    "    {'params':net.subnet1.parameters()}, # 没有指定learningrate就是用默认的学习率\n",
    "    {'params':net.subnet2.parameters(),lr=0.01}\n",
    "],lr=0.03)\n",
    "```\n",
    "\n",
    "在有些情况下，不想将学习率固定为常数，有两种方法可以对其进行调整：\n",
    "- 修改`optimizer.param_groups`中对应的学习率\n",
    "- 新建一个优化器，由于`optimizer`是轻量级的，构建的开销很小，但是这样做对使用动量的优化器（如Adam），会丢失动量信息，造成损失函数的震荡。_\n",
    "```\n",
    "for parm_group in optimizer.param_groups:\n",
    "    param_group[\"lr\"] *= 0.01 # 学习率为之前的0.01\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 模型训练\n",
    "\n",
    "进行模型训练时，通过调用`optimizer`的`step`函数进行迭代"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": "epoch 1,loss:0.000052\nepoch 2,loss:0.000066\nepoch 3,loss:0.000078\nlinear.weight\ntensor([[ 1.9997, -3.4001]])\nlinear.bias\ntensor([4.1989])\n"
    }
   ],
   "source": [
    "num_epochs = 3\n",
    "for epoch in range(0,num_epochs):\n",
    "    for X,y in data_iter:\n",
    "        output= net(X)\n",
    "        l = loss(output,y.view(-1,1))\n",
    "        optimizer.zero_grad() # 梯度清零\n",
    "        l.backward() # 反向传播\n",
    "        optimizer.step()\n",
    "\n",
    "    print(\"epoch %d,loss:%f\" %(epoch + 1,l.item()))\n",
    "\n",
    "# 输出参数值\n",
    "for name,param in net.named_parameters():\n",
    "    print(name)\n",
    "    print(param.data)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 总结\n",
    "\n",
    "PyTorch使用流程：\n",
    "\n",
    "- **训练数据的搜集处理**，对于简单的数据可以直接使用`TensorDataset`。通常数据集无法全部加载到内存中，可以自定义`Dataset`，然后使用`Dataloader`迭代读取。\n",
    "- **模型的定义**，网络结构的实现。可以自定义`nn.Module`，也可以使用PyTorch提供的容器类`Sequeantial`或者`ModuleList`，依次添加网络层。\n",
    "- **模型参数初始化**，`torch.init`模块提供了各种样的参数初始化方法。 这里可以使用`parameters()`或者`named_parameters()`来访问网络层的参数。 \n",
    "- **损失函数定义**， 损失函数可以作为一个特殊层存在于网络中，在`nn`模块提供了各种常见的损失函数。\n",
    "- **定义优化算法** ，`torch.optim`模块提供了各种常用的优化算法，例如：SGD，Adam以及RMSprop等。PyTorch也提供了一些动态修改参数的机制。\n",
    "- **训练** 其训练过程可以使用如下代码模板\n",
    "    ```\n",
    "    num_epochs = 3\n",
    "    for epoch in range(0,num_epochs):\n",
    "        for X,y in data_iter:\n",
    "            output= net(X)\n",
    "            l = loss(output,y.view(-1,1))\n",
    "            optimizer.zero_grad() # 梯度清零\n",
    "            l.backward() # 反向传播\n",
    "            optimizer.step()\n",
    "\n",
    "        print(\"epoch %d,loss:%f\" %(epoch + 1,l.item()))\n",
    "    ```\n",
    "    在每个epoch中调用`Dataloader`得到每个batch的训练样本，\n",
    "    - 使用模型对样本数据进行正向传播`net(x)`，得到当前模型预测结果。 \n",
    "    - 调用损失函数`loss`计算当前模型的预测结果$\\hat{y}$ 和样本标签$y$的误差。\n",
    "    - 调用`backward`进行反向传播，计算各个节点的梯度\n",
    "    - `step`更新网络层的各个参数"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7-final"
  },
  "orig_nbformat": 2,
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}